Arabic Handwritten Word Category Classification Using Bag of Features
Abstract
Human writing is highly variable and inconsistent, which makes the offline recognition of handwritten words extremely challenging. This paper describes a novel approach to the offline recognition of handwritten Arabic words. By treating each word as a single, inseparable object, the proposed approach aims to recognize words by their complete shape. It builds on the bag-of-visual-words method, which has been used effectively for image classification. The study consisted of four main stages. First, a set of image patches was sampled for training, and a speeded-up robust features (SURF) descriptor was used to characterize them. Next, the bag-of-visual-words model was built by clustering the descriptors with the K-means algorithm. A histogram for each whole word was then computed and served as the image feature vector. This vector was used to train a support vector machine classifier to distinguish between handwritten words. Finally, the proposed method was tested on a sample of Arabic words extracted from the IFN/ENIT database, and the results indicate that the bag-of-visual-words approach is a promising method for recognizing and classifying handwritten Arabic words. The best and average recognition rates of the proposed method are 85% and 75%, respectively.
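The middle two stages of the pipeline (vocabulary construction and histogram encoding) can be illustrated with a minimal sketch. This is not the authors' implementation: the descriptors here are synthetic 2-D points standing in for real SURF descriptors (which have 64 or 128 dimensions), the function names `kmeans` and `bovw_histogram` are hypothetical, and SURF extraction and SVM training are left out entirely.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centers):
    """Index of the visual word (cluster center) closest to vec."""
    return min(range(len(centers)), key=lambda i: sq_dist(vec, centers[i]))

def kmeans(descriptors, k, iters=20, seed=0):
    """Plain Lloyd's K-means; the returned centers are the visual vocabulary."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(descriptors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centers)].append(d)
        for i, g in enumerate(groups):
            if g:  # keep the old center if a cluster went empty
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def bovw_histogram(descriptors, centers):
    """Normalized count of descriptors assigned to each visual word."""
    hist = [0.0] * len(centers)
    for d in descriptors:
        hist[nearest(d, centers)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Synthetic stand-in for descriptors (assumption: not real SURF output).
rng = random.Random(1)

def fake_descriptors(center, n):
    return [[center[0] + rng.gauss(0, 0.1), center[1] + rng.gauss(0, 0.1)]
            for _ in range(n)]

# Descriptors pooled from all training patches define the vocabulary.
training = fake_descriptors((0, 0), 30) + fake_descriptors((5, 5), 30)
vocab = kmeans(training, k=2)

# One "word image": its histogram is the feature vector a classifier would see.
word_image = fake_descriptors((0, 0), 8) + fake_descriptors((5, 5), 2)
hist = bovw_histogram(word_image, vocab)
print(hist)
```

In a full system, one such histogram per word image would be passed, together with its class label, to a support vector machine for training. The vocabulary size `k` is a tuning parameter that the abstract does not specify.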
Similar resources
Holistic Farsi handwritten word recognition using gradient features
In this paper we address the issue of recognizing Farsi handwritten words. Two types of gradient features are extracted from a sliding vertical stripe which sweeps across a word image. These are directional and intensity gradient features. The feature vector extracted from each stripe is then coded using the Self Organizing Map (SOM). In this method each word is modeled using the discrete Hidde...
Classification of Personal Arabic Handwritten Documents
This paper presents a novel holistic technique for classifying Arabic handwritten text documents. The classification of Arabic handwritten documents is performed in several steps. First, the Arabic handwritten document images are segmented into words, and then each word is segmented into its connected parts. Second, several structural and statistical features are extracted from these connected ...
Holistic Approach for Classifying and Retrieving Personal Arabic Handwritten Documents
This paper presents a novel holistic technique for classifying and retrieving Arabic handwritten text documents. The retrieval of Arabic handwritten documents is performed in several steps. First, the Arabic handwritten document images are segmented into words, and then each word is segmented into its connected parts. Second, several features are extracted from these connected parts and then co...
Word-Based Handwritten Arabic Scripts Recognition Using Dynamic Bayesian Network
In this paper, a multi-class classification system for handwritten Arabic words using a Dynamic Bayesian Network (DBN) is proposed, and its technical details are presented in three stages: preprocessing, feature extraction, and classification. First, words are segmented from the input scripts and normalized in size. Then, features are extracted from each normalized word, where...
Word Spotting in Handwritten Arabic Documents Using Bag-Of-Descriptors
This paper presents a query-by-example word spotting method for handwritten Arabic documents, based on the Scale Invariant Feature Transform (SIFT). It uses no word or line segmentation, because any segmentation errors affect the subsequent word representation. First, the interest points are automatically extracted from the images using the SIFT detector; then, we use the SIFT descriptor to represent eac...
Publication date: 2016